Spatial correlations in SIR epidemic models 
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Abstract: - We investigate the role of global mixing in epidemic processes. We first construct a simplified model of the SIR 
epidemic using a realistic population distribution. Using this model, we examine possible mechanisms for destruction of spatial 
correlations, in an attempt to produce correlation curves similar to those reported recently for real epidemiological data. We find 
that introduction of a long-range interaction destroys spatial correlations very easily if neighbourhood sizes are homogeneous. 
For inhomogeneous neighbourhoods, very strong long-range coupling is required to achieve a similar effect. 
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1 Introduction 

The main mechanism of transmission of infectious dis- 
eases is usually the direct contact of susceptible individ- 
uals with an infective one. Since this contact is normally 
highly localized in space, it is quite natural to expect that 
space should play an important role in dynamics of in- 
fectious diseases. There is a clear evidence that some in- 
fectious diseases in animal populations spread geographi- 
cally. A well-known example is the spatial advance of fox 
rabies in Europe, which seems to have started in Poland 
in 1939, and has moved steadily westward at a rate of 
30-60 km per year Similar patterns of spread have 
been observed in the epizootic of rabies among raccoons 
in eastern United States and Canada. It started in 1977 
in an area on the West Virginia- Virginia border and has 
moved at a rate of 30-40 km per year 

In human populations, the spread of the Black Death in 
Europe from 1347-1350 is the most often quoted example 
i3|- Introduced to Italy in 1347, it spread up through Eu- 
rope at 300-500 km per year. For other diseases the spatial 
effects appear to be somewhat less pronounced. For ex- 
ample, even though in the past influenza pandemics used 
to reveal spatial patterns [4], there is a very strong evi- 
dence that in recent times the spread of influenza is statis- 



tically uniform in space. Bonabeau et al. [5] examined the 
spatial correlation structure of the influenza epidemic for 
the epidemic of winter 1994-5, using high-quality data 
collected by a large network of general practitioners in 
France. They found that at least for influenza epidemics, 
space does not play any important role, and the spread of 
the disease is dominated by the mean-field dynamics. 

The goal of this paper is to investigate the role of global 
mixing in the spread of epidemics. We first construct a 
simplified model of the SIR epidemic based on realistic 
population distribution. Using this model, we investigate 
possible mechanisms for destruction of spatial correla- 
tions, trying to achieve similar effects as those reported 
by Bonabeau et al. 0. 

2 Description of the model 

In order to study the influence of global mixing on the 
spread of epidemics we use a model based on interact- 
ing particle systems, in the spirit of our earlier work 
H3 0. Models of this type take various forms, rang- 
ing from stochastic interacting particle models |8) to 
models based on cellular automata or coupled map lat- 
tices miioiiniin). 

Consider a set of N individuals, labelled with consecu- 
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tive integers 1,2 ... ,N. This set of labels will be denoted 
by C. We assume that each individual can be in three dis- 
tinct states, susceptible (S), infected (I) or removed (R). 
There are two ways to change the state of a single indi- 
vidual. A susceptible individual who comes in direct con- 
tact with an infected individual can become infected with 
probability p. Infected individual can become removed 
with probability q. The precise description of the model 
is as follows. 

The state of the z-th individual at the time step k will 
be described by a Boolean vector variable rj(i,k) = 
(Vs(h k),r]i(i, k),i] R (i, k)), where r) T (i,t) = 1 if the i- 
th individual is in the state r, where r € {S, I, R}, and 
rj T {i, k) = otherwise. We assume that i = 1,2, ... ,N 
and k € N, i.e., the time is discrete. Hence, the vector 
r)(i, k) can be in one of the following states: r)(i, k) = 
(1,0,0) for a susceptible individual, r){i,k) = (0,1,0) 
for an infected individual, and r](i,k) = (0,0, 1) for a 
removed individual. No other values of r/(i, k) are possi- 
ble in SIR epidemic model. In SIR model an individual 
can only be at one state at any give time and transitions 
occur only from susceptible to infected and from infected 
to removed. The removed does not become susceptible 
or infected again in SIR model. Therefore, SIR model is 
suitable for studying spread of influenza in the same sea- 
son because the same type of influenza virus can infect an 
individual only once and once the individual is recovered 
from the flu it becomes immune to this type of virus. 

We further assume that at the time step k the z-th indi- 
vidual can interact with individuals from a subset of C, to 
be denoted by C(i,k). Using this notation, we obtain 

ris{i,k + l) = r] S {i,k) X^^ij, k), (1) 

jeC(i,k) 

r]i(i,k + l) = rjs(i,k)(l- X i!jtk r]i(j,k)] 

jeO(i,k) 

+ Vl (i,k)Yi, (2) 
rj R (i, k + 1) = r] R (i,k) +r] I (i,k)Y i , (3) 

where Xij )k is a set of iid Boolean random variables such 
that Pr(X- J; k = l)=p, Pr{X hjtk = 0) = 1 - p, and y 4 
is a set of iid Boolean variables such that Pr(Yi = 1) = q, 
Pr(Yi =0) = l-q. 

Note that j k = 1 means that the disease has been 
transmitted from the j-th individual to the i-th individ- 
ual at time step k. If at least one of the random vari- 



ables X iijjk in the product UjeC{i,k) x i,j,kViU, k ) takes 
the value 1, then the product becomes 0, and we obtain 
r/s(i, k+ 1) = 0, meaning that the i-th individual changes 
its state from susceptible to infected. 

The crucial feature of this model is the set C(i,k), rep- 
resenting all individuals with whom the i-th individual 
may have interacted at the time step k. In a large human 
population, it is almost impossible to know C(i,k) for 
each individual, so we make some simplifying assump- 
tions. First of all, it is clear that the spatial distribution of 
individuals must be reflected in the structure of C(i,k). 
We have decided to use realistic population distribution 
for Southern and Central Ontario using census data ob- 
tained from Statistic Canada fl3l fT4ll - The selected region 
is mostly surrounded by waters of Great Lakes, forming 
natural boundary conditions. The data set specifies popu- 
lation of so called "dissemination areas" , i.e., small areas 
composed of one or more neighbouring street blocks. We 
had access to longitude and latitude data with accuracy of 
roughly 0.01°, hence some dissemination areas in densely 
populated regions had the same geographical coordinates. 
We combined these dissemination areas into larger units, 
to be called "modified dissemination areas" (MDA). 

We will now define the set C(i, k) using the concept 
of MDAs. This set will be characterized by two posi- 
tive integers n c and nj. Let us label all MDAs in the 
region we are considering by integers m = 1,2,..., M, 
where in our case M = 5069. For an individual i be- 
longing to the m-th MDA, the set C(i, k) consists of all 
individuals belonging to the m-th MDA, plus all individ- 
uals belonging to n c MDAs nearest to m, plus nj MDAs 
randomly selected among all remaining MDAs. While 
the "close neighbours", i.e., n c nearest MDAs, will not 
change with time, the "far neighbours", i.e., nf randomly 
selected MDAs, will be randomly reselected at each time 
step. 

3 Mean Field 

The model described in the previous section involves 
strong spatial coupling between individuals. Before we 
describe consequences of this fact, we will first construct 
a set of equations which approximate dynamics of the 
model under the assumption of "perfect mixing", i.e., ne- 
glecting the spatial coupling. 

The state of the system described by eq. Q]3l) at time 
step k is determined by the states of all individuals and is 
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described by the Boolean random field rj(k) = {rj(i, k) : 
i = 0,...,N}. The Boolean field {r](k) : i = 0, 1, 2 . . .} 
is then a Markov stochastic process. 

By taking the expectation -EV^o) of this Markov pro- 
cess when the initial configuration is ?7(0), i.e. p T (i, k) = 
£wo) [Vrihk)] for r £ {S, I, R}, we get the probabil- 
ities of the i-th individual being susceptible, infected or 
removed at time k. 

In the mean field approximation, we assume indepen- 
dence of random variables r/ T (i, k). Hence, the expected 
value of a product of such variables is equal to the prod- 
uct of expected values. Under this assumption, taking ex- 
pected values of both sides of equations Qj3j) we obtain 

Ps (i,k + 1) = Ps (i,k) H (l- PPl (j,k)), (4) 

jeC(i,k) 

Pl (i,k + 1) = Ps (i,k)(l- J] (l-ppi(j,k)j) 

+ Pl (i,k)(l-q), (5) 

p R (i,k + l) = p R (i,k) + Pl (i,k)q. (6) 

Since mean field approximations neglect spatial correla- 
tions, we further assume that p T (i, k) is independent of i, 
i.e., p T (i, k) = p T (k). Even though sets C(i, k) have dif- 
ferent number of elements for different i and k, for the 
purpose of this approximate derivation we assume that 
they all have the same number of elements {l+n c +rif)D, 
where D is the average MDA population. All these as- 
sumptions lead to 

Ps(k + l)=ps(k)(l- P pi(k))^ +n ^ D , (7) 

Pl {k + l) = Pl (k)+ Ps (k) 

-p s {k){l-p Pl {k))^ +n ^ D - q Pl (k), (8) 

PR {k + l) =p R (k)+q Pl {k). (9) 

The third equation in the above set is obviously redundant, 
since p s (k) + pi(k) + p R (k) = 1. 

Similarly to the classical Kermack-McKendrick model, 
mean field equations (Ql-© exhibit a threshold phe- 
nomenon. Depending on the choice of parameters, we 
can have pi(k) < pi(0) for all k, meaning that the in- 
fection dies out, or we can have an outbreak of the epi- 
demic, meaning that pi{k) > pi(0) for some k. The 
intermediate scenario of constant pi{k) will occur when 
pi(k) = pi(0), i.e., when 

ps(0) - p s {m-PPM) (1+nc+nf)D - qpi(0) = 0. 

(10) 



Assuming that initially there are no individuals in the 
removed group, we have ps(0) = 1-/3/(0). Fur- 
thermore, if (1 + n c + nf)D is large, we can assume 
(1 - ppj(0))( 1+nc+n /) D w 1 - p(l + n c + n f )D Pl {0). 
Solving eq. (flOl for q under these assumptions we obtain 

q=(l- Pl (0)yi + n c + n f )Dp. (11) 

Thus, assuming the mean field approximation the epi- 
demic can occur only if q < ^1 — p/(0)^ (l + n c +rif)Dp. 

4 Dynamics of the model 

The mean-field equations derived in the previous section 
depend only on the sum of n c and ?if. This means, for ex- 
ample, that the model with n c = 5, rij = and the model 
with n c = 0, n f = 5 will have the same mean field equa- 
tions. However, the actual dynamics of these two models 
will be very different. Depending on the relative size of 
n f and n c , the epidemic may propagate or die out, as we 
will see in the following analysis. 

Let N T (k) be the expected value of the total number of 
individuals belonging to class r G {S, I, R}, 

N T (k)=E v(0) (j^vAhk))^ . 

We say that an epidemic occurs if there exists k > such 
that Ni(k) > iV/(0). For fixed p, ?if and n c , there exists 
a threshold value of q to be denoted by q c , such that for 
each q < q c an epidemic occurs, and for q > q c it does not 
occur. Obviously q c depends on p, and this is illustrated in 
Figure \l\ which shows graphs of q c as a function of p for 
several different values of nf and n c , where nj+n c = 12. 
The graphs were obtained numerically by direct computer 
simulations of the model. The condition nj + n c = 12 
means that the size of the neighbourhood is kept constant, 
but the proportion of "far neighbours" (represented by nj) 
to "close neighbours" (represented by n c ) varies. Figure^ 
also shows the mean-field line given by eq. (fTTIi . 

5 Spatial correlations 

As demonstrated in the previous section, the relative size 
of parameters nf and n c controls dynamics of the epi- 
demic process in a significant way, shifting the critical 
line up or down. When n c = 0, i.e., when there are 
no "far neighbours", the epidemic process has a strictly 
local nature, and we can observe well defined epidemic 
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Figure 1: Graphs of critical lines for nj = 5, 9, 11, and 12. In all 
cases, 7i c = 12 — n/. Solid line represents mean field approximation. 



fronts propagating in space. This is illustrated in Figure 
12 where the epidemic starts at k = at a single centrally 
located MDA (Figure^), with n c = 12, n f = 0. Modi- 
fied dissemination areas are represented by pixels colored 
according to the density of individuals of a given type, 
such that the red component of the color represents den- 
sity of infected individuals, green density of susceptibles, 
and blue density of removed individuals. By density we 
mean the number of individuals of a given type divided by 
the population of the MDA. Epidemic wave propagating 
outwards can be clearly seen in subsequent snapshots (c), 
(d) and (e). The front is mostly red, meaning that the bulk 
of infected individuals is located at the front. After these 
individuals gradually recover, the center becomes blue. 

Let us now consider slightly modified parameters, tak- 
ing n c = 11, rif = 1. This means that we now replace 
one "close" MDA by one "far" MDA. This does not seem 
to be a significant change, yet the effect of this change 
is quite spectacular. As we can see in Figure |3j the epi- 
demic propagates much faster, and there are no visible 
fronts. The disease quickly spreads over the entire region, 
and large metropolitan areas become red in a short time, 
as shown in Figure |3jb). This suggests that infected indi- 
viduals are more likely to be found in densely populated 
regions, and their distribution is dictated by the population 
distribution - unlike in Figure 13 where infected individu- 
als are to be found mainly at the propagating front. 

In order to quantify this observation, we use a spatial 
correlation function for densities of infected individuals 
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Figure 2: Example of a propagating epidemic front for n c = 12, 
n f = 0, p = 0.00005, q = 0.05, with (a) k = 0, (b) k = 25, (c) k = 
50 and (d) k — 75. Modified dissemination areas are represented by 
pixels colored according to density of individuals of a given type, such 
that the red component represents density if infected, green density of 
susceptibles, and blue density of removed individuals. 



defined as 

h(r,k) = (Vl(hk)Tllti> k ))r<d{i,j)<r+Ar ' 

where d(i,j) is the distance between i-th and j-th indi- 
vidual, and < • > represents averaging over all pairs i, j 
satisfying condition r < d(i,j) < r + Ar. In subsequent 
considerations we will take Ar = 1 km. The distance be- 
tween two individuals is defined as the distance between 
MDAs to which they belong. 

Consider now a specific example of the epidemic pro- 
cess described by eq. (I1I3L where p = 0.000015, q = 
0.2, and n c + n/ = 12. For this choice of parameters 
epidemics always occur as long as n f > 0. Figure |4] 
shows graphs of the correlation functions h(r,k max ) at 
the peak of each epidemic, so that k max is the time step 
at which the number of infected individuals achieves its 
maximum value. An interesting phenomenon can be ob- 
served in that figure: while the increase of the proportion 
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Figure 3: Development of the epidemic for n c = 11, 71/ = 1, p = 
0.00005, g = 0.05, with (a) k = 0, (b) k = 15, (c) k = 25 and (d) 
A: = 50. Colour coding is the same as in the previous figure. 



of "far" neighbours does destroy spatial correlations, one 
needs very high proportion of "far"neighbours to make 
the correlation curve completely flat. In it is reported 
that for influenza epidemics h(r, k max ) ~ r o.04±o.03 jf 
we fit h(r,k max ) = Cr a curve to the correlation data 
shown in Figure^, we obtain values of the exponent a 
as shown in Table 1. In order to obtain a of compara- 
bly small magnitude as reported in [5 1, one would have to 
take rif equal to at least 10, meaning that 77% of neigh- 
bours would have to be "far neighbours". In reality, this 
would require that 77% of all individuals one interacted 
with were not his/her neighbours, coworkers, etc., but in- 
dividuals from randomly selected and possibly remote ge- 
ographical regions. This is clearly at odds with our intu- 
ition regarding social interactions, especially outside large 
metropolitan areas. This prompted us to investigate fur- 
ther and to find out what is responsible for this effect. 

Upon closer examination of spatial patterns generated 
in simulations of our model, we reached the conclusion 
that the inhomogeneity of population sizes in neighbour- 



Figure 4: Graphs of the correlation function h(r, k max ) for different 
values of n/, where p — 0.000015, q = 0.2, and n c + nf = 12. 



n c 


rif 


a 


11 


1 


-0.72 +/- 0.03 


10 


2 


-0.45 +/- 0.02 


9 


3 


-0.32 +/- 0.01 


8 


4 


-0.27 +/- 0.01 


7 


5 


-0.191 +/- 0.007 


6 


6 


-0.179 +/- 0.009 


5 


7 


-0.120 +/- 0.005 


4 


8 


-0.115+/- 0.006 


3 


9 


-0.071 +/- 0.003 


2 


10 


-0.057 +/- 0.002 


1 


11 


-0.047 +/- 0.003 



Table 1: Values of the exponent a obtained by fitting h(r, k max ) 
Cr a to simulation data. 



hoods C(i,k) makes spatial correlations so persistent. 
Since different MDAs have different population sizes, we 
expect that some individuals will have larger neighbour- 
hood populations than others, and as a result they will 
be more likely to get infected, even if the proportion of 
infected individuals is the same in all MDAs. This will 
build up clusters of infected individuals around populous 
MDAs. 

To test if this is indeed the factor responsible for strong 
spatial correlations in our model, we replaced all MDA 
population sizes with constant population size D, i.e., av- 
erage MDA population size. As expected, graphs of the 
correlation functions obtained in this case were are all es- 
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sentially flat, with the exponent a close to zero even in the 
case of nf = 1, when we obtained a = 0.023 ± 0.002. 

6 Conclusions 

As demonstrated in the previous section, spatial correla- 
tions are difficult to destroy if neighbourhood sizes are 
inhomogeneous. Very strong mixing, i.e., very signifi- 
cant amount of long-range interactions is required to ob- 
tain flat correlations curves. On the other hand, for homo- 
geneous neighbourhood sizes, even relatively small long- 
range interaction immediately forces the process into the 
perfect-mixing regime, resulting in the lack of spatial cor- 
relations. 

There is a strong evidence that in recent times epi- 
demics of influenza do not produce significant spatial cor- 
relations [ 5 1 in spite of the heterogeneity of the population 
distribution. One can speculate that this must be due to 
one of the following two factors: either most of our daily 
interactons are long-ranged, or conversely, most interac- 
tions are short-ranged, but the number of interactions per 
unit time does not vary too greatly from individual to in- 
dividual. 

We suspect that the answer depends on the social and 
economic structure of the underlying community or ad- 
jacent communities. The first scenario applies to large 
metropolitan areas and conurbations with a large number 
of commuters, while the second scenario is more likely 
for small communities without much interaction with the 
outside word. The above considerations also indicate that 
a more realistic model will have to separate two aspects 
of the interaction. First of all, with how many individuals 
does a given person interact with per unit time, e.g., per 
day, and how is this number distributed? This could con- 
ceivably be determined experimentally without much dif- 
ficulty. The second aspect is much harder, though: from 
what pool are these individuals chosen and how? While 
an accurate answer to this question does not seem to be 
possible, we hope to get at least some insight from anal- 
ysis of travel and traffic data. This work is currently in 
progress and will be presented elsewhere. 
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